Install requirements

In [ ]:
pip install -r requirements.txt

Import and instantiate the PeakCalling class

In [1]:
from peakcaller import PeakCalling

# Total number of reads in each dataset, used to normalize the coverage
reads_count_1 = 1504149
reads_count_2 = 8991837

# Instantiate the PeakCalling class
# (no backslashes are needed: lines inside parentheses continue implicitly)
peak_calling = PeakCalling(
    data_1='./data/coverage_16t5_plus_r209.txt',
    data_2='./data/coverage_185_t5_sorted.txt',
    threshold=0.6,
    window_size=250,
    reads_count_1=reads_count_1,
    reads_count_2=reads_count_2,
)
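The read counts are passed in so that two datasets of very different sequencing depth can be compared. How PeakCalling normalizes internally is not shown here; the sketch below only illustrates one common approach, scaling per-position depth to reads per million. The function name `normalize_coverage` and the `scale` parameter are hypothetical and not part of peakcaller.

```python
def normalize_coverage(coverage, reads_count, scale=1_000_000):
    """Scale raw per-position depth to depth per `scale` sequenced reads.

    Assumption: a simple reads-per-million scaling. PeakCalling may
    normalize differently.
    """
    return [depth * scale / reads_count for depth in coverage]


# Toy depth values (made up for illustration), with the read counts from above
raw_1 = [15, 30, 45]
raw_2 = [90, 180, 270]

norm_1 = normalize_coverage(raw_1, reads_count=1504149)
norm_2 = normalize_coverage(raw_2, reads_count=8991837)
print(norm_1)
print(norm_2)
```

After scaling, the two tracks are expressed in the same units, so a difference between them reflects a change in relative coverage rather than a difference in library size.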

Find significant changes and return them as a pandas DataFrame

In [2]:
changes = peak_calling.find_significant_coverage_changes()
changes.head(10)
Out[2]:
    Window    Change  Start_Pos  End_Pos
18      18  0.912614       4501     4750
19      19  0.846189       4751     5000
20      20  0.663222       5001     5250
24      24  0.613345       6001     6250
50      50  0.789492      12501    12750
51      51  0.816089      12751    13000
64      64  0.896764      16001    16250
65      65  0.832993      16251    16500
66      66  0.641532      16501    16750
74      74  0.654182      18501    18750
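In the output above, each window index maps to a 1-based, inclusive genome interval of `window_size` positions (window 18 covers 4501 to 4750), and several significant windows are adjacent. The sketch below reproduces that mapping and merges runs of consecutive windows into longer regions. Both helper names are hypothetical and are not part of peakcaller.

```python
def window_bounds(window, window_size=250):
    """Return the 1-based inclusive (start, end) positions of a window."""
    start = window * window_size + 1
    end = (window + 1) * window_size
    return start, end


def merge_adjacent_windows(windows, window_size=250):
    """Merge runs of consecutive window indices into (start, end) regions."""
    regions = []
    run_start = prev = None
    for w in sorted(windows):
        if prev is not None and w == prev + 1:
            prev = w  # still inside the current run
            continue
        if prev is not None:
            # Close the previous run before starting a new one
            regions.append((window_bounds(run_start, window_size)[0],
                            window_bounds(prev, window_size)[1]))
        run_start = prev = w
    if prev is not None:
        regions.append((window_bounds(run_start, window_size)[0],
                        window_bounds(prev, window_size)[1]))
    return regions


# Window indices taken from the first ten rows of the table
significant = [18, 19, 20, 24, 50, 51, 64, 65, 66, 74]
print(merge_adjacent_windows(significant))
```

Merging is optional, but it makes it easier to see that, for example, windows 18 to 20 form a single 750 bp region.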

Visualize coverage and significant changes

In [3]:
peak_calling.visualize_coverage()

Match significant changes with genome annotation

In [4]:
gff_path = 'data/t5.gff3'
peak_calling.compare_coverage_changes_with_annotation(gff_annotation=gff_path)
/opt/anaconda3/envs/test/lib/python3.11/site-packages/genomenotebook/track.py:81: UserWarning:

You are trying to plot more than 10^5 glyphs, this might overflow your memory. Consider using bounds or reducing the number of datapoints.

Green Line: ./data/coverage_16t5_plus_r209.txt
Blue Line: ./data/coverage_185_t5_sorted.txt
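Matching a significant window against the annotation comes down to an interval-overlap test between the window and each GFF3 feature. The sketch below shows that idea with a minimal hand-written GFF3 reader. It is not how `compare_coverage_changes_with_annotation` is implemented; the helper names, the sequence ID `demo`, and the genes `geneA` and `geneB` with their coordinates are all made up for illustration.

```python
def parse_gff3(lines):
    """Parse GFF3 lines into a list of feature dicts (minimal reader)."""
    features = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines, comments and directives
        cols = line.split('\t')
        if len(cols) != 9:
            continue  # not a well-formed feature line
        attrs = dict(item.split('=', 1)
                     for item in cols[8].split(';') if '=' in item)
        features.append({
            'type': cols[2],
            'start': int(cols[3]),  # GFF3 coordinates are 1-based, inclusive
            'end': int(cols[4]),
            'strand': cols[6],
            'name': attrs.get('Name', attrs.get('ID', '')),
        })
    return features


def overlapping_features(features, start, end, feature_type='gene'):
    """Return features of the given type that overlap [start, end]."""
    return [f for f in features
            if f['type'] == feature_type
            and f['start'] <= end and f['end'] >= start]


# Hypothetical annotation lines, NOT taken from data/t5.gff3
example_gff = [
    "##gff-version 3",
    "demo\t.\tgene\t4400\t4600\t.\t+\t.\tID=gene1;Name=geneA",
    "demo\t.\tgene\t5300\t5900\t.\t-\t.\tID=gene2;Name=geneB",
]

features = parse_gff3(example_gff)
hits = overlapping_features(features, 4501, 4750)
print([f['name'] for f in hits])  # ['geneA']
```

To run the same check on the real annotation, the lines could be read from the file with `open(gff_path)` and the window bounds taken from the `Start_Pos` and `End_Pos` columns of `changes`.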

Analyze results and investigate genes of interest

image

The genes of interest are those encoding triggers of the PARIS immune defense system.

According to the annotation, these genes are named T5.112 and T5.100.